Philosophy has a reputation for being hard to understand, yet it is still worth catching a glimpse of it: once we understand a person's worldview, there is almost no limit to what we can do with that information, from advertising and political campaigning to self-exploration and therapy. Below is some basic analysis of a dataset of philosophical texts.
import pandas as pd
import numpy as np
import nltk
import matplotlib.pyplot as plt
import seaborn as sb
from wordcloud import WordCloud, STOPWORDS
df=pd.read_csv('philosophy_data.csv')
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 360808 entries, 0 to 360807
Data columns (total 11 columns):
 #   Column                     Non-Null Count   Dtype
---  ------                     --------------   -----
 0   title                      360808 non-null  object
 1   author                     360808 non-null  object
 2   school                     360808 non-null  object
 3   sentence_spacy             360808 non-null  object
 4   sentence_str               360808 non-null  object
 5   original_publication_date  360808 non-null  int64
 6   corpus_edition_date        360808 non-null  int64
 7   sentence_length            360808 non-null  int64
 8   sentence_lowered           360808 non-null  object
 9   tokenized_txt              360808 non-null  object
 10  lemmatized_str             360808 non-null  object
dtypes: int64(3), object(8)
memory usage: 30.3+ MB
df.head()
| | title | author | school | sentence_spacy | sentence_str | original_publication_date | corpus_edition_date | sentence_length | sentence_lowered | tokenized_txt | lemmatized_str |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Plato - Complete Works | Plato | plato | What's new, Socrates, to make you leave your ... | What's new, Socrates, to make you leave your ... | -350 | 1997 | 125 | what's new, socrates, to make you leave your ... | ['what', 'new', 'socrates', 'to', 'make', 'you... | what be new , Socrates , to make -PRON- lea... |
| 1 | Plato - Complete Works | Plato | plato | Surely you are not prosecuting anyone before t... | Surely you are not prosecuting anyone before t... | -350 | 1997 | 69 | surely you are not prosecuting anyone before t... | ['surely', 'you', 'are', 'not', 'prosecuting',... | surely -PRON- be not prosecute anyone before ... |
| 2 | Plato - Complete Works | Plato | plato | The Athenians do not call this a prosecution b... | The Athenians do not call this a prosecution b... | -350 | 1997 | 74 | the athenians do not call this a prosecution b... | ['the', 'athenians', 'do', 'not', 'call', 'thi... | the Athenians do not call this a prosecution ... |
| 3 | Plato - Complete Works | Plato | plato | What is this you say? | What is this you say? | -350 | 1997 | 21 | what is this you say? | ['what', 'is', 'this', 'you', 'say'] | what be this -PRON- say ? |
| 4 | Plato - Complete Works | Plato | plato | Someone must have indicted you, for you are no... | Someone must have indicted you, for you are no... | -350 | 1997 | 101 | someone must have indicted you, for you are no... | ['someone', 'must', 'have', 'indicted', 'you',... | someone must have indict -PRON- , for -PRON- ... |
Numeric features of the data
fig=plt.figure(figsize=(10,10))
df.school.value_counts().plot.bar()
<AxesSubplot:>
df.sentence_length.describe()
plt.figure(figsize=(16,6))
df.sentence_length.plot(kind='hist',bins=300)
plt.title('sentence length frequency')
plt.show()
schools=df.school.unique().tolist()
plt.figure(figsize=(16,6))
sb.violinplot(x='school', y='sentence_length', data=df)
plt.title('sentence length grouped by school')
plt.grid()
authors=df.author.unique().tolist()
plt.figure(figsize=(30,6))
sb.violinplot(x='author', y='sentence_length', data=df)
plt.title('sentence length grouped by author')
plt.grid()
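The violin plots above show the shape of each distribution, but exact figures are hard to read off them. A small numeric summary per group can complement the plots. The following is a minimal sketch on a made-up toy frame (the `toy` data and the choice of `count`, `median` and `max` are assumptions for illustration); on the real data the same `groupby` call would presumably be applied to `df`.

```python
import pandas as pd

# Toy frame with the same two columns used in the violin plots (values are made up).
toy = pd.DataFrame({
    "school": ["plato", "plato", "plato", "aristotle", "aristotle"],
    "sentence_length": [125, 69, 21, 80, 120],
})

# One row per school: number of sentences, median length and longest sentence.
summary = toy.groupby("school")["sentence_length"].agg(["count", "median", "max"])
print(summary)
```

Replacing `"school"` with `"author"` would give the per-author equivalent of the second plot.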
Here we build a word cloud for each school to better visualize which words occur most frequently in it.
stopwords=set(STOPWORDS)
for sc in schools:
    df_temp=df[df.school==sc]
    print('School= ', sc.upper(), ":")
    # join with a space so that words from adjacent sentences do not run together
    text=' '.join(txt for txt in df_temp.sentence_lowered)
    wordcloud=WordCloud(stopwords=stopwords, max_font_size=80, max_words=300,
                        width=600, height=400, background_color='white').generate(text)
    plt.figure(figsize=(12,8))
    plt.imshow(wordcloud, interpolation='bilinear')
    plt.axis('off')
    plt.show()
School= PLATO :
School= ARISTOTLE :
School= EMPIRICISM :
School= RATIONALISM :
School= ANALYTIC :
School= CONTINENTAL :
School= PHENOMENOLOGY :
School= GERMAN_IDEALISM :
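A word cloud encodes word frequency as font size, so the underlying counts can also be tabulated directly as a rough cross-check. The sketch below uses only the standard library; the `top_words` helper, the toy sentences and the tiny stop-word set are all hypothetical and stand in for `df_temp.sentence_lowered` and `STOPWORDS`. Its tokenization is deliberately simple and will not match the `wordcloud` package exactly.

```python
from collections import Counter
import re

def top_words(sentences, stopwords, n=3):
    """Return the n most common non-stop-words across the given sentences."""
    counts = Counter()
    for sentence in sentences:
        # Lowercase, then keep runs of letters and apostrophes as words.
        words = re.findall(r"[a-z']+", sentence.lower())
        counts.update(w for w in words if w not in stopwords)
    return counts.most_common(n)

# Toy data standing in for the sentences of one school (hypothetical).
toy_sentences = [
    "the soul is immortal",
    "the soul seeks the good",
    "what is the good",
]
toy_stopwords = {"the", "is", "what"}

result = top_words(toy_sentences, toy_stopwords, n=2)
print(result)
```

With the toy data, "soul" and "good" each occur twice and are the two words returned.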
columns=df.school.unique()
index=df.original_publication_date.sort_values().unique()
columns,index
(array(['plato', 'aristotle', 'empiricism', 'rationalism', 'analytic',
'continental', 'phenomenology', 'german_idealism', 'communism',
'capitalism', 'stoicism', 'nietzsche', 'feminism'], dtype=object),
array([-350, -320, 125, 170, 1637, 1641, 1674, 1677, 1689, 1710, 1713,
1739, 1776, 1779, 1781, 1788, 1790, 1792, 1798, 1807, 1817, 1820,
1848, 1862, 1883, 1886, 1887, 1888, 1907, 1910, 1912, 1921, 1927,
1936, 1945, 1949, 1950, 1953, 1959, 1961, 1963, 1966, 1967, 1968,
1972, 1975, 1981, 1985], dtype=int64))
trend_matrix=pd.DataFrame(np.zeros((len(index),len(columns))),index=index,columns=columns)
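# for every publication year, count the sentences that belong to each school
for i in range(len(index)):
    for j in range(len(columns)):
        try:
            trend_matrix.iloc[i,j]=df[df.original_publication_date==index[i]]['school'].value_counts()[columns[j]]
        except KeyError:
            # this school has no sentences for this year
            trend_matrix.iloc[i,j]=0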
trend_matrix
| | plato | aristotle | empiricism | rationalism | analytic | continental | phenomenology | german_idealism | communism | capitalism | stoicism | nietzsche | feminism |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| -350 | 38366.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| -320 | 0.0 | 48779.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 125 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 323.0 | 0.0 | 0.0 |
| 170 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2212.0 | 0.0 | 0.0 |
| 1637 | 0.0 | 0.0 | 0.0 | 340.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1641 | 0.0 | 0.0 | 0.0 | 792.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1674 | 0.0 | 0.0 | 0.0 | 12997.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1677 | 0.0 | 0.0 | 0.0 | 3793.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1689 | 0.0 | 0.0 | 8885.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1710 | 0.0 | 0.0 | 1040.0 | 5027.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1713 | 0.0 | 0.0 | 1694.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1739 | 0.0 | 0.0 | 7047.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1776 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 11693.0 | 0.0 | 0.0 | 0.0 |
| 1779 | 0.0 | 0.0 | 1265.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1781 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 7472.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1788 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2452.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1790 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4204.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1792 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2559.0 |
| 1798 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 5308.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1807 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 7099.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1817 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 10678.0 | 0.0 | 3090.0 | 0.0 | 0.0 | 0.0 |
| 1820 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4923.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1848 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 493.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1862 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4469.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1883 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 12996.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1886 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1906.0 | 0.0 |
| 1887 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 5916.0 | 0.0 |
| 1888 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 5726.0 | 0.0 |
| 1907 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 910.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1910 | 0.0 | 0.0 | 0.0 | 0.0 | 3668.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1912 | 0.0 | 0.0 | 0.0 | 0.0 | 1560.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1921 | 0.0 | 0.0 | 0.0 | 0.0 | 4725.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1927 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 8505.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1936 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4832.0 | 0.0 | 0.0 | 3411.0 | 0.0 | 0.0 | 0.0 |
| 1945 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 7592.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1949 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 13017.0 |
| 1950 | 0.0 | 0.0 | 0.0 | 0.0 | 9357.0 | 0.0 | 6734.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1953 | 0.0 | 0.0 | 0.0 | 0.0 | 5838.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1959 | 0.0 | 0.0 | 0.0 | 0.0 | 4678.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1961 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 8033.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1963 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2518.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1966 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4689.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1967 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 5999.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1968 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 5861.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1972 | 0.0 | 0.0 | 0.0 | 0.0 | 2681.0 | 6679.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1975 | 0.0 | 0.0 | 0.0 | 0.0 | 9798.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1981 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 3059.0 |
| 1985 | 0.0 | 0.0 | 0.0 | 0.0 | 13120.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
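The nested loop used to fill the year-by-school matrix can likely be replaced by a single `pd.crosstab` call, which counts the rows for every combination of two columns. The following is a minimal sketch on a made-up toy frame (the `toy` data and the name `toy_trend` are assumptions for illustration). Note two differences from the loop version: the columns come out in alphabetical order rather than in order of first appearance, and the counts are integers rather than floats.

```python
import pandas as pd

# Toy frame with the two columns the trend matrix depends on (values are made up).
toy = pd.DataFrame({
    "school": ["plato", "plato", "aristotle", "stoicism", "stoicism", "stoicism"],
    "original_publication_date": [-350, -350, -320, 125, 170, 170],
})

# Rows are publication years (sorted), columns are schools, cells are sentence counts.
toy_trend = pd.crosstab(toy["original_publication_date"], toy["school"])
print(toy_trend)
```

Combinations that never occur are filled with 0, which is what the `except KeyError` branch of the loop does by hand.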
fig=plt.figure(figsize=(10,10))
ax=fig.add_subplot(1,1,1)
trend_matrix.plot(ax=ax)
<AxesSubplot:>
From the graph above, we can see that Aristotle's thought dominated at the very beginning before it began to wane. From the 17th century onward, a variety of ideas flourished, and this trend lasted for roughly 350 years (1637 to 1985 in this corpus). One fun fact is that no philosophical school lasts forever.